VoiceCard – Multilingual Voice Introduction Generator
You give it an API key + config. It gives you a ready component.
What It Actually Is
A CLI tool + generator that takes:
- Your TTS provider API key (ElevenLabs, OpenAI, etc.)
- A single voice.config.ts file
- Your intro text per language (or just one language + auto-translate)
And outputs:
- A self-contained React component (VoiceCard.tsx) – paste into any project
- A self-contained Svelte component (VoiceCard.svelte) – paste into any project
- Pre-generated audio files (fetched from the API and saved locally)
- Zero runtime API calls – audio is baked in at generation time
The visitor's browser never touches your API key. Never calls any external API. Just plays a local audio file.
The Mental Model
You run: npx voicecard generate
It does:
1. Reads voice.config.ts
2. Calls your TTS API with each language's text
3. Downloads + saves audio files to /public/voicecard/
4. Writes VoiceCard.tsx (React) – ready to paste
5. Writes VoiceCard.svelte – ready to paste
You get: <VoiceCard /> – drop it anywhere; it works offline
No runtime server. No proxy. No API key exposure. Just static files + a component.
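The generate step above can be sketched as a pure planning function. This is an illustrative sketch, not the CLI's actual internals; the names `GenerateConfig` and `planOutputs` are assumptions for the example:

```typescript
// Sketch of what a `voicecard generate` run plans to write, given the
// config shape shown later in this document. Illustrative only.
interface GenerateConfig {
  languages: Record<string, { text: string }>;
  output: {
    audioDir: string;
    componentDir: string;
    frameworks: ("react" | "svelte")[];
  };
}

// One MP3 per language, one component file per requested framework.
function planOutputs(config: GenerateConfig): string[] {
  const audio = Object.keys(config.languages).map(
    (lang) => `${config.output.audioDir}/intro-${lang}.mp3`
  );
  const ext = { react: "tsx", svelte: "svelte" } as const;
  const components = config.output.frameworks.map(
    (fw) => `${config.output.componentDir}/VoiceCard.${ext[fw]}`
  );
  return [...audio, ...components];
}
```

For a two-language config targeting both frameworks, this plans four files: two MP3s and two components.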
Usage: End-to-End in 4 Steps
Step 1 – Install
npm install -g voicecard
# or use without installing:
npx voicecard
Step 2 – Create your config
voice.config.ts in your project root:
import { defineConfig } from 'voicecard';
export default defineConfig({
// --- Who you are ---
owner: {
name: 'Enes Yeşil',
photo: '/images/enes.jpg', // optional, shown in card UI
},
// --- Your TTS provider ---
provider: {
name: 'elevenlabs',
apiKey: process.env.ELEVENLABS_API_KEY, // from .env – never committed
voiceId: 'your-cloned-voice-id', // your voice clone ID
},
// --- Your intro text ---
// Option A: Write each language manually
languages: {
en: {
text: `Hi, I'm Enes. I build tools at the intersection of design and engineering.
I care about making complex things feel simple.`,
label: 'English',
flag: '🇬🇧',
},
tr: {
text: `Merhaba, ben Enes. Tasarım ve mühendisliğin kesişiminde araçlar üretiyorum.
Karmaşık şeyleri sade hissettirmek benim için önemli.`,
label: 'Türkçe',
flag: '🇹🇷',
},
de: {
text: `Hallo, ich bin Enes. Ich entwickle Tools an der Schnittstelle von Design und Engineering.`,
label: 'Deutsch',
flag: '🇩🇪',
},
},
// --- Option B: Write one language, auto-translate the rest ---
// autoTranslate: {
// source: 'en',
// targets: ['tr', 'de', 'fr', 'ja', 'zh', 'ar', 'es', 'pt'],
// provider: 'deepl', // or 'openai', 'google'
// apiKey: process.env.DEEPL_API_KEY,
// },
// --- Output ---
output: {
audioDir: 'public/voicecard', // where audio files are saved
componentDir: 'src/components', // where the component is written
frameworks: ['react', 'svelte'], // which components to generate
},
// --- UI ---
ui: {
theme: 'card', // 'minimal' | 'card' | 'floating'
trigger: 'button', // 'button' | 'auto'
defaultLanguage: 'en',
showTranscript: true,
showLanguageSwitcher: true,
},
});
Step 3 – Generate
npx voicecard generate
Output in terminal:
✓ Fetching audio: en → public/voicecard/intro-en.mp3 (28KB)
✓ Fetching audio: tr → public/voicecard/intro-tr.mp3 (24KB)
✓ Fetching audio: de → public/voicecard/intro-de.mp3 (31KB)
✓ Writing component → src/components/VoiceCard.tsx
✓ Writing component → src/components/VoiceCard.svelte
Done. Drop <VoiceCard /> into your page.
Step 4 – Drop it in
React (Next.js, Vite, Remix, etc.):
import VoiceCard from '@/components/VoiceCard';
export default function Hero() {
return (
<main>
<VoiceCard />
<h1>Enes Yeşil</h1>
</main>
);
}
Svelte (SvelteKit, etc.):
<script>
import VoiceCard from '$lib/components/VoiceCard.svelte';
</script>
<VoiceCard />
<h1>Enes Yeşil</h1>
That's it. No config props to pass. Everything is baked into the generated component.
What the Generated Component Contains
The generated file is fully self-contained and readable. No magic. No opaque blob.
// VoiceCard.tsx – generated by voicecard on 2026-03-09
// DO NOT EDIT – re-run `npx voicecard generate` to regenerate
const VOICECARD_CONFIG = {
owner: { name: 'Enes Yeşil', photo: '/images/enes.jpg' },
defaultLanguage: 'en',
languages: {
en: { label: 'English', flag: '🇬🇧', audio: '/voicecard/intro-en.mp3',
transcript: 'Hi, I\'m Enes. I build tools...' },
tr: { label: 'Türkçe', flag: '🇹🇷', audio: '/voicecard/intro-tr.mp3',
transcript: 'Merhaba, ben Enes...' },
de: { label: 'Deutsch', flag: '🇩🇪', audio: '/voicecard/intro-de.mp3',
transcript: 'Hallo, ich bin Enes...' },
},
ui: { theme: 'card', trigger: 'button', showTranscript: true }
};
// ... ~150 lines of clean React component code ...
export default function VoiceCard() { ... }
It's a normal component you can read, fork, and modify. The generator just saves you from writing it.
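The heart of the generated component is a small playback state machine: one language active at a time, and switching languages stops playback. A framework-agnostic sketch of that logic, with illustrative names (the generated file may structure this differently):

```typescript
// Minimal playback state machine, as a pure reducer. Illustrative sketch
// of the behavior described above, not the generated component's code.
type PlayerState = { language: string; playing: boolean };

type Action =
  | { type: "toggle" }                              // play/pause button
  | { type: "stop" }                                // e.g. Escape key
  | { type: "switchLanguage"; language: string };   // language switcher

function playerReducer(state: PlayerState, action: Action): PlayerState {
  switch (action.type) {
    case "toggle":
      return { ...state, playing: !state.playing };
    case "stop":
      return { ...state, playing: false };
    case "switchLanguage":
      // Switching language stops playback so the new audio starts fresh.
      return { language: action.language, playing: false };
  }
}
```

In the React output this would sit behind `useReducer`, with an `HTMLAudioElement` played or paused as a side effect of `state.playing`.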
CLI Commands
# Generate components + audio from config
npx voicecard generate
# Only re-fetch audio (config didn't change, just refreshing voice)
npx voicecard audio
# Only regenerate components (audio already exists, UI tweak)
npx voicecard component
# Preview what would be generated without writing files
npx voicecard generate --dry-run
# Validate your config before running
npx voicecard validate
# List all available TTS providers
npx voicecard providers
# Check which languages your chosen provider's voice clone supports
npx voicecard languages --provider elevenlabs
# Interactive setup wizard (for first-time users)
npx voicecard init
Supported TTS Providers
All providers are plug-and-play via the provider.name config key.
| Provider | Voice Cloning | Languages | Notes |
|---|---|---|---|
| elevenlabs | Yes | 32 | Best quality for cloned voices |
| openai | No (preset voices) | 57 | Fast, cheap, great quality |
| deepgram | Yes | 36 | Good alternative to ElevenLabs |
| azure | Yes (Custom Neural) | 140+ | Most language coverage |
| google | Yes (Custom Voice) | 220+ | Max coverage, complex setup |
| browser | No | varies | Zero cost, no API key needed |
You only ever set one provider. The CLI calls that provider's API during generation. At runtime, the visitor's browser just plays MP3 files.
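The plug-and-play claim implies a common adapter interface behind `provider.name`. A hedged sketch of what such an interface could look like, with illustrative names (`TTSProvider`, `synthesize`) that are assumptions, not the tool's documented API:

```typescript
// Hypothetical provider adapter: each supported provider implements the
// same synthesize() contract, returning raw audio bytes for one language.
interface TTSProvider {
  name: string;
  supportsCloning: boolean;
  synthesize(
    text: string,
    opts: { voiceId?: string; language: string }
  ): Promise<Uint8Array>;
}

// A stub provider useful for dry runs or tests: no API call, empty audio.
const stubProvider: TTSProvider = {
  name: "stub",
  supportsCloning: false,
  async synthesize(_text, _opts) {
    // A real adapter would POST to the provider's API here.
    return new Uint8Array(0);
  },
};
```

The CLI would pick the adapter matching `provider.name`, call `synthesize()` once per language at generation time, and write the returned bytes to `audioDir`.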
Auto-Translate Feature
If you don't speak 10 languages but want 10 language versions, the autoTranslate block handles it:
autoTranslate: {
source: 'en', // your master language
targets: ['tr', 'de', 'fr', 'ja', 'zh', 'ar', 'es', 'pt', 'hi', 'ko'],
provider: 'deepl', // translation API
apiKey: process.env.DEEPL_API_KEY,
reviewDir: 'voicecard/translations', // saves translated text for your review
}
What happens:
- Translates your English text to all target languages
- Saves translation files to reviewDir – you review them before generating audio
- Once you approve (or edit), run npx voicecard generate to produce audio
The review step is intentional. Auto-translated text for your personal intro can be awkward. You want to catch that before it's spoken in your voice.
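The review gate can be thought of as a simple partition: a target language proceeds to TTS only once its reviewed text file exists in `reviewDir`. A sketch under that assumption (`reviewStatus` is an illustrative name, not a documented function):

```typescript
// Illustrative sketch of the approve-before-audio gate: given the target
// languages and the filenames found in reviewDir, split targets into
// those cleared for audio generation and those still awaiting review.
function reviewStatus(
  targets: string[],
  reviewedFiles: string[] // e.g. ["de.txt", "fr.txt"]
): { approved: string[]; pending: string[] } {
  const reviewed = new Set(reviewedFiles.map((f) => f.replace(/\.txt$/, "")));
  return {
    approved: targets.filter((lang) => reviewed.has(lang)),
    pending: targets.filter((lang) => !reviewed.has(lang)),
  };
}
```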
Project Structure After Generation
your-project/
├── voice.config.ts          ← you write this
├── .env                     ← ELEVENLABS_API_KEY=... (gitignored)
│
├── public/
│   └── voicecard/
│       ├── intro-en.mp3     ← generated audio files
│       ├── intro-tr.mp3
│       └── intro-de.mp3
│
├── voicecard/
│   └── translations/        ← auto-translated text (for review)
│       ├── de.txt
│       └── fr.txt
│
└── src/components/
    ├── VoiceCard.tsx        ← generated React component
    └── VoiceCard.svelte     ← generated Svelte component
Generated Component: UI Themes
All three themes are included in the generated file. You switch via ui.theme in config and regenerate – or change the theme prop at runtime if you want to toggle it dynamically.
minimal
[ ▶ Hear my intro ]  🇬🇧 🇹🇷 🇩🇪
Play button + language flags. Fits inline anywhere.
card
┌─────────────────────────────┐
│ [photo]  Enes Yeşil         │
│ ▶ Hear my intro             │
│ ━━━━━━━━━━──────            │ ← waveform progress bar
│ 🇬🇧 English 🇹🇷 Türkçe 🇩🇪     │
│ ▼ Read transcript           │
└─────────────────────────────┘
Bio card widget. Good as a hero section element.
floating
┌───────────┐
│ 🔊 Listen │ ← bottom-right corner, always visible
└───────────┘
Non-intrusive. Expands on click to show full card.
Security Model
This is the right way to think about it:
Generation time (YOUR machine):
voice.config.ts + API key → CLI → API call → MP3 files saved locally
Runtime (visitor's browser):
HTML page loads → component plays /public/voicecard/intro-en.mp3
→ zero API calls, zero key exposure, zero external dependencies
Your API key is only used at generation time, on your machine. It's in .env, gitignored. The output files (MP3s + component) contain no keys.
This also means the component works offline, behind a firewall, or anywhere β because it's just audio files.
Regeneration Strategy
Audio files are expensive to regenerate (API calls cost money + time). The CLI is smart about this:
npx voicecard generate
# Only fetches audio for languages where text changed since last run
# Compares hash of text content against a lockfile: voicecard.lock.json
# Skips unchanged languages
# Always regenerates components (cheap β local operation)
voicecard.lock.json – tracks what was generated:
{
"generated": "2026-03-09T14:22:00Z",
"provider": "elevenlabs",
"voiceId": "abc123",
"hashes": {
"en": "sha256:a1b2c3...",
"tr": "sha256:d4e5f6...",
"de": "sha256:97b8c9..."
}
}
Commit voicecard.lock.json and the MP3 files to git. Keep the API key out of git: it lives in .env (gitignored) and reaches voice.config.ts only via process.env.
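The change-detection described above can be sketched in a few lines: hash each language's current text and compare against the lockfile's recorded hashes. The function names here (`textHash`, `needsRegeneration`) are illustrative, not the CLI's internals:

```typescript
import { createHash } from "node:crypto";

// Hash a language's intro text the way the lockfile example records it.
function textHash(text: string): string {
  return "sha256:" + createHash("sha256").update(text, "utf8").digest("hex");
}

// Return the languages whose text no longer matches the lockfile, i.e.
// the only ones that need a fresh (paid) TTS API call.
function needsRegeneration(
  texts: Record<string, string>,
  lockHashes: Record<string, string>
): string[] {
  return Object.entries(texts)
    .filter(([lang, text]) => lockHashes[lang] !== textHash(text))
    .map(([lang]) => lang);
}
```

A language absent from the lockfile hashes is treated as changed, so newly added languages are always fetched.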
Accessibility
Hard requirements, not optional:
- Transcript always available as accessible text (visible or toggle-able)
- aria-live region announces playback state to screen readers
- Full keyboard control: Space to play/pause, Escape to stop, Tab through languages
- No autoplay, ever. The trigger: 'auto' config option plays only after a user gesture has occurred on the page
- Respects prefers-reduced-motion – animations disabled, no pulsing waveform
- Color contrast AA minimum on all themes
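The keyboard contract above boils down to a small key-to-action mapping; everything else (Tab, arrow keys) keeps native focus behavior. A sketch with an illustrative name (`keyToAction`):

```typescript
// Map KeyboardEvent.key values to player actions per the list above.
// Unlisted keys return "none" so native behavior (Tab focus, etc.) is kept.
type KeyAction = "toggle" | "stop" | "none";

function keyToAction(key: string): KeyAction {
  switch (key) {
    case " ":
      return "toggle"; // Space plays/pauses
    case "Escape":
      return "stop";   // Escape stops playback
    default:
      return "none";   // don't intercept anything else
  }
}
```

In the component this would run in a keydown handler, calling `event.preventDefault()` only when the result is not `"none"`.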
npx voicecard init – Setup Wizard
For first-time users, the interactive wizard eliminates config guesswork:
$ npx voicecard init
? Which TTS provider do you want to use?
❯ ElevenLabs (voice cloning, 32 languages)
  OpenAI TTS (no cloning, 57 languages, cheaper)
  Browser TTS (free, no API key)
? Paste your ElevenLabs API key: **********************
? Do you have a voice clone, or should we create one?
❯ I have a voice clone ID already
  Walk me through creating one
? What frameworks do you need?
❯ ◯ React
  ◯ Svelte
  ◯ Vanilla HTML
? Which languages do you want? (space to select)
❯ ◯ English
  ◯ Turkish
  ◯ German
  ◯ French
  ◯ Japanese
...
? Write your intro in English:
❯ [text input]
? Auto-translate to selected languages?
❯ Yes – I'll review translations before generating audio
  No – I'll write each language myself
✓ Created voice.config.ts
✓ Created .env (with your API key)
✓ Added voicecard/ and .env to .gitignore
Run `npx voicecard generate` when ready.
Roadmap
v1 – Core (what's described above)
- CLI generator
- ElevenLabs + OpenAI providers
- React + Svelte output
- 3 themes
- Auto-translate with review step
v2 – Studio
- Browser-based UI to run the generator without a terminal (npx voicecard studio)
- Drag-and-drop to record your own voice instead of using a TTS API
- Live preview of the component before generating
v3 – Publish
- npx voicecard publish – push audio files to Cloudflare R2 or S3 directly
- CDN URLs instead of local /public/ paths
- Version management (keep old audio versions when you update your intro)
Created: March 2026 – World of Ideas